Abstract

In this study we show that current state-of-the-art synthetically 
generated fingerprints can easily be discriminated from real finger- 
prints. Tests are conducted on the 12 publicly available databases of 
FVC2000, FVC2002 and FVC2004 which are important benchmarks 
for evaluating the performance of fingerprint recognition algorithms; 
3 of these 12 databases consist of artificial fingerprints generated by 
the SFinGe software. We propose a method based on extended minu- 
tiae histograms which can distinguish between real and synthetic prints 
with very high accuracy. This 'test of realness' can be applied to syn- 
thetic fingerprints produced by any method. The connection to the 
knowledge about the biological formation process of finger patterns is 
discussed and suggestions for the improvement of synthetic fingerprint 
generation are given. Two additional application areas for extended 
minutiae histograms are considered: identification and quantifying the 
weight of fingerprint evidence in court. 
1 Introduction

Today, many commercial, governmental and forensic applications rely on 
fingerprint recognition for verifying or establishing the identity of a person. 
Among these, methods building on minutiae matching play an eminent role. 
Usually, matching routines are tested not only on real fingers: in order
to provide theoretically unlimited sample sizes, synthetic fingerprint
generation systems such as SFinGe by Cappelli et al. [3] have been developed
in the past. Independently, methods have been developed to reconstruct fin- 
gerprints from minutiae templates (cf. ^3]). Both methodologies are very 
relevant in many application areas: 

• Constructing synthetic fingerprint images facilitates the cheap creation 
of very large databases for testing and comparing the performance of 
algorithms in verification and identification scenarios. 

• It provides ground truth data for evaluating the performance of forensic
experts [42] as well as minutiae extraction algorithms.



• It can improve the matching performance of low-quality and latent
fingerprints [10].



• Fingerprint reconstruction can be a building block for solving inter- 
operability problems, e.g. on comparing fingerprints acquired from 
different sensors. 

• Research in this area raises the awareness for aspects of security, pri- 
vacy and data protection bearing in mind that an attacker may utilize 
existing techniques for creating a spoof and prepare a presentation 
attack. 

• Mixing prints of two or more real fingers for generating virtual iden- 
tities, obscuring private information or creating cancelable templates 
[39]. 

Synthetic prints should be as realistic as possible with respect to all
properties and features which are relevant for fingerprint recognition,
especially their minutiae distribution. Otherwise, a human may be fooled
by the look of a synthetic print, but the suitability of such prints, e.g.
for evaluating fingerprint recognition algorithms, may be challenged and
results obtained on artificial databases would not be meaningful.

A unifying concept of the 'correct' minutiae distribution. Fingerprint
synthesis and fingerprint reconstruction have long been treated as
different tasks. This is understandable, given that the issue of realistic
minutiae distributions has only played a subordinate role in theoretical
model building and practical research. In our contribution we provide a
simple method to assess the minutiae distributions of single fingerprints
as well as of samples of fingerprints. We demonstrate that, after
training, this method allows us to decide that the minutiae patterns of
synthetically generated fingerprints are not 'correct'. In particular, we
believe that including realistic minutiae distributions leads to a unified
concept in which synthetic fingerprint generation and fingerprint
reconstruction are in fact two sides of the same coin. We will return to
this point in Section 6.1.



Fingerprints of fingerprints: Minutiae histograms (MHs), introduced
below, assign any given fingerprint image a fixed-length feature vector.
This feature vector is not only highly potent in discriminating real
fingerprints from fingerprints synthetically generated by the current
state-of-the-art system; in a preliminary study we additionally
demonstrate that it is also highly discriminatory among real fingerprints.
Given reliably extracted minutiae and sufficient overlap of latent
fingerprint images, MHs allow for fast and effective matching. MHs thus
appear to hold considerable potential that has yet to be exploited.



1.1 Construction and Reconstruction of Fingerprints 

As made clear above, in our contribution we focus on the role of minutiae
distributions. This issue is currently gaining momentum in the scientific
community, cf. [10], [21], [48]. We begin our exposition with the
biological principles governing minutiae formation and their distribution,
which to date are still not satisfactorily understood [22], [20].

Fingerprint formation guided by Merkel cells. Kücken and Champod
propose a model for fingerprint formation [21] that has two major
influence factors: growth forces which create mechanical compressive
stress [22], [23] on the one hand, and Merkel cells rearranging from a
random initial configuration into lines minimizing the compressive stress
and inducing primary ridges on the other hand. Merkel cells interact with
each other in reaction-diffusion systems of short-range attraction and
long-range repulsion [12]. Based on empirical evidence from embryonic
volar tissue evolution (e.g. [1]), Kücken and Champod let solutions of
suitable partial differential equations propagate from three centers: one
along the flexion crease, one at the volar pad (the core area) and one
from the nail furrow. Based on specific parameter choices, this process
eventually forms a ridge line pattern featuring minutiae. Kücken and
Champod analyzed the images resulting from simulation runs of their growth
model and discovered qualitative differences in comparison to real
fingerprints. They conclude that, with respect to the natural variability
of arrangements of minutiae, 'only an empirical acquisition of genuine
fingerprints will provide an adequate source of data'.



SFinGe. In a nutshell, growing fingerprint patches containing no minutiae
are generated starting from a number of randomly located points by the
iterative application of Gabor filters according to a previously generated
global orientation field [46] and a non-constant ridge frequency pattern.
When patches meet, minutiae are produced wherever necessary for the
consistency of the global ridge pattern. A detailed description of the
SFinGe method by Cappelli et al. can be found in [3], Chapter 6 in [33]
and Chapter 18 in [41].

In fact, the SFinGe model tacitly assumes biological hypotheses of
fingerprint pattern formation that are slightly different from the ones
described above. First, fingerprint patterns no longer propagate from a
well-defined system of three original sources but rather from a multitude
of sources at random locations. Secondly, a main governing principle for
minutiae creation lies in the compatibility of ridge patterns whenever
growing patches touch. These hypotheses are in themselves very interesting
and can be viewed as intuitively natural, and their validity can be
assessed with our methodology. Our results indicate, however, that these
hypotheses do not explain the process of minutiae formation
satisfactorily.

Shortcomings of the SFinGe model have been observed by Zhao et al.
[48], who attribute them to the lack of control over the generation of
features such as minutiae. They propose to synthesize fingerprints based
on statistical feature models. It is unclear whether any currently
existing model
adequately describes the distribution of minutiae in real fingers. However, 
this study provides a 'test of realness' which can also be applied to the 
synthetic images according to 



Creating forensic fingermarks from real prints. The approach by
Rodriguez et al. [42], since the minutiae are based on real fingers, does
not suffer from unrealistic minutiae configurations. It is, however,
subject to increased time, money and data protection constraints. Forensic
fingermarks (latent fingerprint images) are simulated in a semi-automatic
way: fingers of volunteers are recorded using a livescan device. During a
period of about 30 seconds, each person performs a series of predefined
movements which results in fingerprint images with various distortions.
Images are captured at a rate of four frames per second. Minutiae are
automatically extracted, followed by manual inspection and correction of
falsely extracted or missed minutiae. Starting with this ground truth data
set, latents are simulated using a region containing a cluster of 5-12
minutiae from the real fingerprint images.

One use case for these simulated latents is to test the minutiae marking
performance of human experts, which can be evaluated against the available
ground truth information. Another application area is benchmarking the
identification performance of AFIS software.

Reconstruction. Various studies have shown that fingerprints can be
reconstructed from minutiae templates J5l Il3t [TOl [28] . There are two major 
approaches for the automatic reconstruction of fingerprint images: first, the 
iterative application of Gabor Filters as in SFinGe, and second, the usage 
of amplitude- and frequency- modulated (AM-FM) functions [231 ES, [26] . In 
both cases, the first step is the estimation of an orientation field which fits 
the minutiae pattern. Fingerprint reconstruction from minutiae templates 
and the generation of synthetic images follow similar principles. However, 
the goal in the reconstruction scenario is to produce the same number of 
minutiae at the same locations and with same direction and type as in the 
template. 

1.2 Plan of the Paper 

In the next section, we introduce extended minutiae histograms (MHs):
statistics of pairs of minutiae are used in combination with minutiae
type information and interridge distances to obtain a fingerprint of
fingerprints. In Section 3, extended MHs are applied for classifying a
fingerprint into one of the two categories, real or synthetic. Tests on
the 12 publicly available databases of FVC2000, FVC2002 and FVC2004 show
the discriminative power of this approach. In Section 4, the suitability
of minutiae histograms for identification purposes is investigated, and in
Section 5, MHs are proposed for the quantification of fingerprint
evidence. Section 6 concludes with a discussion and suggestions for the
generation of more realistic synthetic fingerprints.

2 Extended Minutiae Histograms 



Feature   Description
d         distance in pixels between two minutiae
α         directional difference between two minutiae directions
m_IRD     global mean interridge distance of a fingerprint
v_IRD     global variance of interridge distances
p_bif     percentage of bifurcations in a template

Table 1: Extended 2D-MH feature overview.

Extended MHs comprise
(i) 2D-MHs or 4D-MHs, i.e. histograms of minutiae pairs,
(ii) interridge distances, and
(iii) the percentage of bifurcations and endings in templates.

2.1 2D-Minutiae Histograms 

Here, we consider 2D-minutiae histograms, because they carry sufficient
power to discriminate between real and synthetic prints. We construct a
two-dimensional frequency histogram by computing, for all combinations of
two minutiae in a template, the distance d between the minutiae locations
in pixels and the directional difference α between the two minutiae
directions in degrees. Both features are binned using identically sized,
equidistant intervals.

Figure 1 (c) to (f) shows four histograms with 10 x 10 bins. Here,
distances d_i < d_max = 200 pixels are divided into intervals of 20
pixels (distances increase from top to bottom) and directional differences
α_i into 10 bins covering the total range of 360°. Each directional
difference bin consists of two intervals of 18° range, and differences
above 180° are mirrored into the corresponding bin: e.g., α_1 = 10° and
α_2 = 350° are sorted into the same bin, α_3 = 170° and α_4 = 190° are
grouped into one bin, and α_5 = 90° and α_6 = 270° are also indexed into
the same bin. In Figure 1 (c) to (f), the bins of the first column on the
left are centered at 0°. A detail of the first and last column is
displayed in Figure 2. To highlight the differing distributions, we have
chosen d_max = 80 pixels there.

Figure 1: (a) shows a minutiae template extracted from a real finger of
FVC2000 DB1 Set III and (b) from a synthetic print of FVC2000 DB4. (c)
visualizes the derived 2D-MH for the template in (a), and correspondingly
(e) displays the histogram of (b). The average histogram of FVC2000 DB1
Set I is visualized in (d) and for the synthetic fingers of FVC2000 DB4 in
(f). The EMD (see Section 2.1) from (c) to (d) is 0.66 and from (c) to (f)
1.79. Therefore, the template is correctly classified as belonging to a
real finger. The EMD from (e) to (d) is 1.69 and from (e) to (f) is 0.61,
and consequently, (e) is correctly recognized as stemming from an
artificial finger. In (c-f), distance bins are displayed from top to
bottom, directional difference bins from left to right. A high brightness
value corresponds to a high number of occurrences in a bin.

Figure 2: Mean histograms of mutual minutiae distances between 0 and 80
pixels (each bin is 8 pixels wide) over the databases of FVC2000, FVC2002
and FVC2004. Left column: means of real fingers (DB1, DB2 and DB3
combined). Right column: mean over synthetic fingerprints simulated with
SFinGe (DB4). Top row: counts of minutiae of mutual absolute angle
distance ∈ [0°, 18°]. Bottom row: counts of minutiae of mutual absolute
angle distance ∈ [172°, 180°]. The height of the bar corresponds to the
mean percentage of minutiae pairs sorted into that bin.
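The construction of a 2D-MH described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function name `minutiae_histogram_2d`, the representation of a minutia as a tuple (x, y, direction in degrees) and the treatment of a difference of exactly 180° (assigned to the last bin) are assumptions made here. The default values (d_max = 200 pixels, 10 x 10 bins) follow the setting of Figure 1.

```python
import math
from itertools import combinations


def minutiae_histogram_2d(minutiae, d_max=200.0, n_dist_bins=10,
                          n_dir_bins=10, normalize=True):
    """Build a 2D minutiae histogram (rows: distance, columns: direction).

    minutiae: iterable of (x, y, direction_in_degrees) tuples (assumed layout).
    """
    hist = [[0.0] * n_dir_bins for _ in range(n_dist_bins)]
    dist_width = d_max / n_dist_bins   # 20 pixels for the defaults
    dir_width = 180.0 / n_dir_bins     # 18 degrees after mirroring

    for (x1, y1, t1), (x2, y2, t2) in combinations(minutiae, 2):
        d = math.hypot(x2 - x1, y2 - y1)
        if d >= d_max:
            continue  # only pairs with d < d_max are counted
        alpha = abs(t1 - t2) % 360.0
        folded = min(alpha, 360.0 - alpha)  # mirror differences above 180
        row = min(int(d // dist_width), n_dist_bins - 1)
        col = min(int(folded // dir_width), n_dir_bins - 1)
        hist[row][col] += 1.0

    total = sum(sum(r) for r in hist)
    if normalize and total > 0:
        # Normalize so that the masses of all bins sum to 1.
        hist = [[v / total for v in r] for r in hist]
    return hist


# Three hypothetical minutiae; each of the three pairs lands in its own bin.
template = [(0, 0, 0), (30, 0, 350), (0, 50, 190)]
histogram = minutiae_histogram_2d(template)
```

With the folding step, directional differences of 10° and 350° fall into column 0, 170° and 190° into column 9, and 90° and 270° into column 5, which matches the examples given in the text.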

In order to obtain a sensor-independent and age-independent descriptor,
the minutiae templates are rescaled to prints at 500 DPI. The templates
of FVC2002 DB2 (569 DPI) and FVC2004 DB3 (512 DPI) are demagnified
accordingly, and the rescaling is taken into account during the estimation
of the interridge distances listed in Table 4. The rescaling facilitates a
fair comparison with synthetic prints, which are designed to produce
fingerprint images of approximately 500 DPI.
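As a small illustration of the rescaling step, the following sketch maps minutiae coordinates from the sensor resolution to 500 DPI. The function name `rescale_to_500_dpi` and the tuple layout (x, y, direction in degrees) are assumptions for this example; directions are unaffected by an isotropic scaling.

```python
def rescale_to_500_dpi(minutiae, sensor_dpi, target_dpi=500.0):
    """Scale minutiae locations from sensor_dpi to target_dpi.

    minutiae: iterable of (x, y, direction_in_degrees) tuples (assumed layout).
    """
    factor = target_dpi / sensor_dpi
    return [(x * factor, y * factor, theta) for (x, y, theta) in minutiae]


# A template recorded at 569 DPI (as for FVC2002 DB2) is demagnified.
rescaled = rescale_to_500_dpi([(569.0, 1138.0, 45.0)], sensor_dpi=569.0)
```

Interridge distance estimates are lengths in pixels as well, so they would be multiplied by the same factor.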

The description of FVC2000 DB3 [3D] states that one third of the 19 
volunteers were under 18 years of age. These prints could be easily enlarged 
to the predicted adult print size at 500 DPI using the scaling factor attained 
in [H], if additional information were available (the age and whether the 
fingerprint was acquired from a male or female person). Since children
and adolescents have smaller fingers, the average interridge distance and
the distances between two minutiae are smaller, and applying the same bin
limits as for adults to the unscaled templates of still growing persons
results in different statistics. As a consequence, the classification
could be even better if this additional information were available.

Parameter   Description
d_max       Maximal distance in pixels between two minutiae.
b_dist      Number of bins for distances in pixels.
b_dir       Number of bins covering differences between two minutiae
            directions.
r           Costs for moving mass along neighboring directional bins.
s           Costs for moving mass along neighboring distance bins.
p           Exponentiation factor for costs.

Table 2: 2D-MH parameter overview.

The distance between two 2D-MHs is computed by the Earth Mover's
Distance (EMD), which measures the distance between two distributions by
computing the minimal cost for transforming one distribution into the other.
This metric is especially useful for comparing histograms [29] and has many
applications, e.g. in content-based image retrieval [Hj. The concept was 
first described by G. Monge in 1781 and is also known as Mallows distance, 
Wasserstein distance or Kantorovich-Rubinstein distance. 

The cost matrix of the transport problem consists of the composite costs
for moving mass along the distance bins and for moving mass along the
directional difference bins. For example, moving mass m from distance bin
x and directional bin u to distance bin y and directional bin v results in
the following cost c, where s and r are the costs per distance bin and per
directional bin and p is the exponentiation factor listed in Table 2:

$$c = m \cdot \left( (s \cdot |x - y|)^{p} + (r \cdot |u - v|)^{p} \right)$$
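The composite cost can be written down directly. The sketch below is an illustration under assumptions: the names `transport_cost` and `ground_cost_matrix` are hypothetical, the exponentiation factor is called `p` here, and histogram bins are enumerated row by row (distance bin first, then directional bin).

```python
def transport_cost(mass, x, u, y, v, s=1.0, r=1.0, p=1.0):
    """Cost of moving `mass` from bin (x, u) to bin (y, v).

    x, y: distance bin indices; u, v: directional bin indices.
    s, r: costs per distance bin and per directional bin.
    p: exponentiation factor for costs.
    """
    return mass * ((s * abs(x - y)) ** p + (r * abs(u - v)) ** p)


def ground_cost_matrix(n_dist_bins, n_dir_bins, s=1.0, r=1.0, p=1.0):
    """Cost of moving one unit of mass between every pair of bins."""
    bins = [(x, u) for x in range(n_dist_bins) for u in range(n_dir_bins)]
    return [[transport_cost(1.0, x, u, y, v, s, r, p) for (y, v) in bins]
            for (x, u) in bins]


# Moving mass 0.5 over 2 distance bins and 3 directional bins.
example = transport_cost(0.5, 0, 0, 2, 3, s=1.0, r=2.0, p=2.0)  # 20.0
```

A matrix of this form is the input that a transport solver needs in order to compute the EMD between two normalized histograms.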

In our implementation, we compute the EMD and solve the transport prob- 
lem by applying the auction algorithm of Bertsekas \1\ . For the classification 
of fingerprints into real or synthetic, an average 2D-MH is computed for each 
of the two classes (using a set of minutiae templates described in Section [s]) . 
The average 2D-MHs act as representatives of their respective classes.
All 2D-MHs are normalized such that the sum of the masses in all bins
amounts to 1.

For a minutiae template which is to be classified as real or synthetic,
the 2D-MH of the template is first computed and normalized. Next, the
EMD between the 2D-MH of the unknown class and the average 2D-MH of the
real fingers is computed, as well as the EMD to the average 2D-MH of the
synthetic prints. The minutiae template can be classified by simply
choosing the class with the smaller EMD (see upper half of Table 7), or
both EMDs can be used in combination with additional features (see lower
half of Table 7).
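The nearest-class-mean decision can be sketched as follows. The names `average_histogram`, `classify` and `l1_distance` are assumptions for this example. In particular, `l1_distance` is only a simple stand-in so that the sketch runs on its own; it is not the EMD, and an EMD implementation would be passed as `distance` instead. Ties are resolved in favor of "synthetic" here, which is an arbitrary choice.

```python
def average_histogram(histograms):
    """Bin-wise mean of a list of equally sized 2D histograms."""
    n = len(histograms)
    rows, cols = len(histograms[0]), len(histograms[0][0])
    return [[sum(h[i][j] for h in histograms) / n for j in range(cols)]
            for i in range(rows)]


def l1_distance(h1, h2):
    """Placeholder distance (sum of absolute bin differences), not the EMD."""
    return sum(abs(a - b) for r1, r2 in zip(h1, h2) for a, b in zip(r1, r2))


def classify(hist, mean_real, mean_synthetic, distance):
    """Assign the class whose average histogram is closer to `hist`."""
    d_real = distance(hist, mean_real)
    d_synthetic = distance(hist, mean_synthetic)
    label = "real" if d_real < d_synthetic else "synthetic"
    return label, d_real, d_synthetic


# Toy histograms with a single row and two bins.
mean_real = average_histogram([[[1.0, 0.0]], [[0.8, 0.2]]])
mean_synthetic = [[0.2, 0.8]]
label, d_real, d_synthetic = classify([[0.7, 0.3]], mean_real,
                                      mean_synthetic, l1_distance)
```

For the template of Figure 1 (c), the reported EMDs of 0.66 to the real mean and 1.79 to the synthetic mean would lead to the label "real" under this rule.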

For identification, we use unnormalized intersection distances instead,
cf. Section 4.

2.2 4D-MHs

Minutiae properties not considered so far in the 2D-MH are the angle of the
relative position of the second minutia with respect to the first and the
local combination of minutiae types. Minutiae types are currently accounted
for only as a global value (percentage of bifurcations in a template).

In order to classify a print as real or synthetic, this additional information 
is not necessary, but for other applications, it can be useful and should be 
considered. The 2D-MHs can easily be augmented by these additional two 
dimensions giving 4D-MHs. 

In Section 4, the usage of 4D-MHs in an identification scenario is
studied, and in Section 5 we propose to apply the empirical distribution
of extended 4D-MHs for quantifying the weight of fingerprint evidence in
court.

2.3 Minutiae Type 





           DB 1    DB 2    DB 3    DB 4
FVC2000    38.8    43.6    40.2    30.0
FVC2002    38.1    37.8    36.3    29.2
FVC2004    46.5    37.3    55.8    32.1

Table 3: Average percentages of bifurcations in minutiae templates for real
(DB 1-3) and synthetic (DB 4) fingerprints.

Table 3 lists the average percentages of bifurcations in minutiae
templates for each of the FVC databases. Some fluctuations among the
databases containing real fingerprints can be observed, which can be
attributed to various causes, including image quality and its influence on
the automatic minutiae extraction, the usage of different sensors with
different properties (including the size of the captured finger surface),
and the variability and distribution of minutiae in the fingers of the
volunteers whose fingerprints were acquired for building the databases.
However, we notice that the artificial fingerprints generated by SFinGe
tend to have systematically lower percentages of bifurcations, which is
presumably caused by the image fabrication process. Therefore, this
percentage is included as a feature for the proposed classification of a
fingerprint as real or synthetic.

2.4 Interridge Distances 

The interridge distance image H assigns each pixel a local estimate of the
distance between two neighboring ridges (or two valleys). For prints
acquired from adults at a resolution of 500 DPI, the value of H(x, y) is
in the range from 3 to 25 pixels (forensic researchers have shown that
women tend to have on average slightly smaller interridge distances than
men, see [17], [36]).

The interridge distances listed in Table 4 are estimated by curved regions
as described in [13]. The orientation field estimate for constructing the
curved regions is obtained by using a combination of the line sensor method
[15] and a gradient-based method as described in [16].

Figure 3: Global mean and variance of individual interridge distances per
image for all images of FVC2000, FVC2002 and FVC2004 (one panel per
competition; horizontal axis: estimated global mean interridge distance
per image), estimated by curved regions [13]. Systematic differences
between images acquired from real fingers (DB1 to DB3, green) and
synthetic images (DB4, red) are clearly visible.

We observe that the synthetic fingerprints are marked by lower estimated
mean interridge distances and a lower variance of the interridge distances
in comparison to images of real fingers (see Table 4). Figure 3 shows the
distribution of the global mean interridge distances and the global
variance of interridge distances for all images of FVC2000, FVC2002 and
FVC2004.





           Mean interridge distance          Variance
           DB1     DB2     DB3     DB4       DB1     DB2     DB3     DB4
FVC2000    9.24    9.07    10.53   7.60      3.53    2.87    3.85    2.15
FVC2002    8.85    9.00    8.88    7.34      2.63    2.73    3.35    1.58
FVC2004    9.15    9.40    8.80    7.90      3.46    3.86    4.06    2.17

Table 4: Mean and variance of interridge distances.



3 Separating the Real from the Synthetic 

3.1 Training and Test Protocol 

Tests are conducted on the publicly available fingerprint competitions of
FVC2000, FVC2002 and FVC2004 ^EDISS]. Each competition consists of 
four databases: for the first three databases, fingerprints were acquired from 
volunteers using different sensors. The fourth database contains synthetic 
fingerprints created by the SFinGe software. 

All databases with real and synthetic prints contain images from 110
fingers with 8 impressions per finger. We divided each set of 110 fingers
into three independent, non-overlapping sets (see Table 5).



Set       Fingers      Purpose
Set I     1 to 40      Computing an average template
Set II    41 to 70     Parameter training
Set III   71 to 110    Testing the classification performance

Table 5: Set overview.

On set I, the average template is computed which acts as a representative
for its class (real for DB 1-3 and synthetic for DB 4). A few combinations
of the parameters listed in Table 2 and weights for linear feature fusion
are trained on set II, and the configuration which leads to the best
classification is chosen for the test on set III.
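The partition by finger index can be expressed compactly. The helper names `assign_set` and `split_by_finger` are hypothetical; the index ranges are those of the set overview table.

```python
def assign_set(finger_id):
    """Map a finger index (1 to 110) to its set: 'I', 'II' or 'III'."""
    if 1 <= finger_id <= 40:
        return "I"      # computing the average template
    if 41 <= finger_id <= 70:
        return "II"     # parameter training
    if 71 <= finger_id <= 110:
        return "III"    # testing the classification performance
    raise ValueError("finger index must be between 1 and 110")


def split_by_finger(finger_ids):
    """Group finger indices into the three non-overlapping sets."""
    sets = {"I": [], "II": [], "III": []}
    for finger_id in finger_ids:
        sets[assign_set(finger_id)].append(finger_id)
    return sets


sets = split_by_finger(range(1, 111))
```

Splitting by finger rather than by impression keeps all impressions of one finger in the same set, so the three sets stay independent.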

The features are fused into a combined score s which is computed as
$s = w_0 + w_1 a + w_2 b + w_3 c + w_4 d$, where a is the histogram EMD
difference, b the normalized global mean of interridge distances, c the
normalized global variance of interridge distances and d the normalized
percentage of bifurcations in a template. The weights w_i are trained on
set II.
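The linear fusion itself is a weighted sum. In the sketch below, the function name `fused_score` and the example weights are purely illustrative assumptions; how the features are normalized and how the score is turned into a decision (for instance by a threshold chosen on set II) is not specified here and is left open.

```python
def fused_score(a, b, c, d, weights):
    """Combined score s = w0 + w1*a + w2*b + w3*c + w4*d.

    a: histogram EMD difference
    b: normalized global mean of interridge distances
    c: normalized global variance of interridge distances
    d: normalized percentage of bifurcations
    weights: sequence (w0, w1, w2, w3, w4), e.g. as trained on set II
    """
    w0, w1, w2, w3, w4 = weights
    return w0 + w1 * a + w2 * b + w3 * c + w4 * d


# Hypothetical feature values and weights, for illustration only.
s = fused_score(1.0, 2.0, 3.0, 4.0, (0.5, 1.0, 0.5, 0.0, -0.25))
```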





              FVC2000                 FVC2002                 FVC2004
       DB1   DB2   DB3   DB4   DB1   DB2   DB3   DB4   DB1   DB2   DB3   DB4
         A     B     C     D     E     F     G     H     I     J     K     L
A           0.02  0.06  1.11  0.03  0.03  0.10  0.68  0.03  0.05  0.04  0.44
B                 0.05  1.11  0.03  0.04  0.11  0.68  0.02  0.05  0.03  0.43
C                       1.10  0.06  0.08  0.10  0.68  0.05  0.03  0.03  0.44
D                             1.11  1.11  1.12  0.58  1.11  1.11  1.11  0.81
E                                   0.03  0.11  0.68  0.02  0.06  0.04  0.44
F                                         0.11  0.68  0.04  0.07  0.06  0.44
G                                               0.71  0.11  0.08  0.10  0.48
H                                                     0.68  0.69  0.69  0.29
I                                                           0.05  0.04  0.43
J                                                                 0.03  0.45
K                                                                       0.45
L

Table 6: EMDs between the average 2D-MHs per database for all combinations
of databases (cf. Figure 4).



3.2 Results 



Feature: 2D-MH 



FVC2000 



FVC2002 
FVC2004 



DB 1 vs. DB 4 DB 2 vs. DB 4 DB 3 vs. DB 4 



Set II Set III Set II Set III 



93.3 



86.7 
81.7 



90.0 



95.0 



83.8 



55.0 



90.0 

81.7 



86.3 
67.5 



Set II Set III 



^.3 



85.0 
90.0 



95.0 



78.8 
70.0 



Features: 2D-MH, interridge distances and percentage of bifurcations

           DB 1 vs. DB 4       DB 2 vs. DB 4       DB 3 vs. DB 4
           Set II   Set III    Set II   Set III    Set II   Set III
FVC2000    100.0    92.5       100.0    87.5       100.0    97.5
FVC2002    100.0    97.5       100.0    95.0       100.0    90.0
FVC2004     95.0    72.5        90.0    83.8        98.3    97.5

Table 7: Classification performance (correctly classified templates in
percent). Distances between minutiae histograms are measured using EMD.



The results on all available FVC databases show that the proposed
method based on extended 2D-MHs is able to separate real from synthetic
prints with very high accuracy. On the training sets of all databases of
FVC2000 and FVC2002, the classification performance of the combined
feature set was 100%, and for the corresponding test sets, the performance
was in the range from 87.5 to 97.5%. On the whole, the image quality in
the databases of FVC2004 is clearly lower compared to the quality of
images in previous competitions. Hence, it is more challenging to avoid
errors during the automatic extraction of minutiae from these images. We
used commercial off-the-shelf software for minutiae extraction.

Figure 4: Two-dimensional visualization by multidimensional scaling (MDS)
of EMD distances of mean 2D-MHs as reported in Table 6. (a): Real and
synthetic fingerprints. The 2D-MHs originating from synthetic prints are
labelled as D, H, L. (b): Separate MDS visualization for the mean 2D-MHs
of the real fingerprints only.

The good discriminative power of 2D-MHs alone is not surprising upon
inspection of a 2D visualization by multidimensional scaling (e.g. [34,
Chapter 14]) of the mutual 2D-MH distances from Table 6 in Figure 4. In
the left display, MHs from real fingers cluster in the middle of the left
side, while MHs from synthetic fingerprints come to lie in the upper right
(for the year 2000) and closer to the center towards the bottom (for the
years 2002 and 2004). In fact, we can see that the algorithms leading to
minutiae formation have evidently undergone changes between the years 2000
and 2004, however, only moving the MHs moderately closer to 'realness'.

4 Identification 

Up to this point, we applied minutiae statistics for classifying a
fingerprint as real or synthetic. In this section, we explore the
potential of this idea for identifying individuals.

Figure 5: In 597 out of 770 searches on FVC2000 DB2, the right finger
ranked first. However, in a few cases a considerable part of the database
had to be accessed in order to retrieve the template belonging to the query
finger. In this example, the search with impression four (left) of finger
105 resulted in a small intersection with the minutiae histogram of the
same finger (impression one, right). Missing and spurious minutiae as well
as a small overlap between the aligned templates deteriorate the
performance for the displayed query.

The approach based on minutiae histograms described previously allows a
fingerprint to be represented as a fixed-length feature vector. Obtaining
a fixed-length feature vector representation is a central goal for
biometric template protection schemes and has previously been achieved by
Xu et al. in their spectral minutiae representation [47]. It also appears
highly promising for an identification scenario in which a large database
is searched for a query fingerprint.

In a first test on FVC2000 DB2, the proposed method achieved average
access rates (the average part of the database that is accessed using an
incremental search strategy until the corresponding finger is found)
between 2.33% and 4.01% using a 4-dimensional histogram with 20 bins for
differences between minutiae directions, 20 bins for Euclidean distances
between minutiae locations, 20 bins for the angle of the relative location
of the second minutia with respect to the first and 4 bins for minutiae
type combinations. We chose this database because it was used in tests
measuring the indexing performance by other researchers: De Boer et al.
tested three different features (orientation field, finger code, minutiae
triplets) and their combination on this database and reported rates
between 1.34% and 7.27% [9]. Cappelli et al. proposed minutiae cylinder
code and locality-sensitive hashing for fingerprint indexing [3] and
report an average access rate of 1.72% for this database.
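A 4D-MH with the bin counts mentioned above (20 x 20 x 20 x 4) could be assembled as sketched below. Several details are assumptions made for this illustration and are not taken from the study: ordered pairs of minutiae are used, the directional difference is binned over the full 360° range, the angle of the relative location is measured relative to the direction of the first minutia, d_max is set to 200 pixels, and minutiae types are encoded as 'e' (ending) and 'b' (bifurcation). The function name `minutiae_histogram_4d` is hypothetical.

```python
import math
from itertools import permutations

# Index of the four ordered type combinations (assumed encoding).
TYPE_INDEX = {("e", "e"): 0, ("e", "b"): 1, ("b", "e"): 2, ("b", "b"): 3}


def minutiae_histogram_4d(minutiae, d_max=200.0, n_dir=20, n_dist=20,
                          n_pos=20):
    """Unnormalized 4D-MH, returned as a flat list of counts.

    minutiae: iterable of (x, y, direction_in_degrees, type) tuples.
    """
    hist = [0] * (n_dir * n_dist * n_pos * len(TYPE_INDEX))
    for (x1, y1, t1, k1), (x2, y2, t2, k2) in permutations(minutiae, 2):
        d = math.hypot(x2 - x1, y2 - y1)
        if d >= d_max:
            continue
        alpha = (t2 - t1) % 360.0  # difference of minutiae directions
        # Angle of the second minutia's location, seen from the first one
        # and measured relative to the first minutia's direction.
        phi = (math.degrees(math.atan2(y2 - y1, x2 - x1)) - t1) % 360.0
        i = min(int(alpha // (360.0 / n_dir)), n_dir - 1)
        j = min(int(d // (d_max / n_dist)), n_dist - 1)
        k = min(int(phi // (360.0 / n_pos)), n_pos - 1)
        t = TYPE_INDEX[(k1, k2)]
        hist[((i * n_dist + j) * n_pos + k) * len(TYPE_INDEX) + t] += 1
    return hist


vector = minutiae_histogram_4d([(0, 0, 0, "e"), (10, 0, 90, "b")])
```

Whatever conventions are chosen, the result is a fixed-length vector (32000 entries for these bin counts), which is what makes bin-wise comparison between templates possible.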

In our test, we used the sum of intersections between corresponding bins
as a score (BIS = bin intersection score) and sorted the list of fingers
in the database in descending order of this score. In 597 out of 770
searches, the right finger was ranked first in the list. If the search is
narrowed down to templates with 30 or more minutiae, then it ranked first
in 393 out of 441 searches. We inspected the cases in which a larger
portion of the database was accessed (see the example shown in Figure 5)
and found two reasons for this. The main reason is minutiae extraction
errors (missing and spurious minutiae), which have a negative impact on
the score: missing minutiae reduce the intersection between minutiae
histograms of templates from the same finger, and spurious minutiae can
increase the intersection between histograms from templates of different
fingers. A second reason, which remains to be investigated, may be a very
small overlap area between the two prints. In this context, we note that
our BIS has been designed to mitigate the effect of small overlaps.
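The scoring and ranking step can be sketched as follows, taking the intersection of two bins to be the minimum of their counts. The function names, the representation of the database as a dictionary from finger identifier to flat histogram, and the definition of the access rate for a single query as rank divided by database size are assumptions for this illustration.

```python
def bin_intersection_score(h1, h2):
    """Sum over corresponding bins of the smaller of the two counts."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_candidates(query, database):
    """Finger ids sorted by descending bin intersection score."""
    scores = {fid: bin_intersection_score(query, h)
              for fid, h in database.items()}
    return sorted(scores, key=scores.get, reverse=True)


def access_rate(query, database, true_id):
    """Fraction of the database visited until the true finger is found."""
    ranking = rank_candidates(query, database)
    return (ranking.index(true_id) + 1) / len(ranking)


# Toy database with flat, unnormalized histograms.
database = {"a": [2, 0, 1, 2], "b": [0, 3, 0, 1], "c": [1, 1, 1, 1]}
query = [2, 0, 1, 3]
ranking = rank_candidates(query, database)  # ['a', 'c', 'b']
```

Because the histograms are left unnormalized, a missing minutia lowers the counts of all bins it contributed to, which is consistent with the observation that missing minutiae reduce the intersection for templates of the same finger.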

It is obvious that for a fair comparison between different minutiae-based
indexing methods, the same minutiae templates have to be used. In doing
so, the influence of different minutiae extractors on the identification
performance can be eliminated, and only then do results become comparable.
It is of interest to quantify the impact of different minutiae extractors
and different fingerprint image enhancement techniques on the
identification performance, but these comparisons are beyond the scope of
this study. This first test shows the suitability of minutiae histograms
for identification purposes, and this direction deserves further research.

5 Fingerprint Individuality and Quantifying Weight of Evidence in Court

Universality, collectability, permanence and uniqueness are properties which
render an anatomical or behavioral trait useful for biometric recognition
[33]. Permanence of the fingerprint pattern was scrutinized by Galton [11]
more than a century ago, and it was later confirmed that the pattern's
development is finalized at an estimated gestational age of 24 weeks [1].
Uniqueness of fingerprints is commonly assumed by all researchers and
practitioners dealing with fingerprint recognition. Fingerprint individuality
has never been proven, but there is a long history of models attempting to
explain and quantify fingerprint individuality, starting with Galton in 1892
[11] and continuing to the present day. Stoney gives an overview of 10 major
models formulated between 1892 and 1999 in Chapter 9 of [27]. For recent
additions, please see [40], [35], [2Tj and the references therein. There are
two broad categories of models: first, mathematical models trying to
encompass the distribution of features extracted from observed prints; and
secondly, biology-based models about randomness during the formation of
friction ridge skin in prenatal development [22], [23], \2U\, [21].
Notwithstanding the lack of proof of uniqueness, fingerprints have a long
success story in commercial, governmental and forensic applications [45].

As a consequence of an ongoing reformation process [8], forensic experts
may in the future have to quantify the weight of evidence with probabilities
and error rates instead of a binary decision [38]. Minutiae histograms
and their extension to four dimensions (distance between two minutiae, angle
between minutiae directions, angle of the relative location of the second
with respect to the first, and the minutiae types) can be useful for
improving this quantification, e.g. by likelihood ratios [37].
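
To make the four dimensions concrete, the following sketch computes them for an ordered pair of minutiae and accumulates them in a sparse histogram. Several details are assumptions made for illustration only: a minutia is represented as a tuple (x, y, direction in degrees, type), the relative location angle is measured against the direction of the first minutia, all ordered pairs are counted, and the bin widths are arbitrary placeholder values.

```python
import math
from collections import Counter
from itertools import permutations

def pair_features(m1, m2):
    """Four features of an ordered minutiae pair.

    A minutia is assumed to be a tuple (x, y, direction_deg, type).
    """
    x1, y1, t1, k1 = m1
    x2, y2, t2, k2 = m2
    dist = math.hypot(x2 - x1, y2 - y1)              # distance between the two
    dir_diff = (t2 - t1) % 360.0                     # angle between directions
    bearing = math.degrees(math.atan2(y2 - y1, x2 - x1))
    rel_loc = (bearing - t1) % 360.0                 # location of 2nd w.r.t. 1st
    return dist, dir_diff, rel_loc, (k1, k2)

def four_d_histogram(minutiae, dist_bin=20.0, angle_bin=30.0):
    """Sparse 4D histogram over all ordered pairs (bin widths are hypothetical)."""
    n_angle = int(round(360.0 / angle_bin))
    hist = Counter()
    for m1, m2 in permutations(minutiae, 2):
        d, a, r, types = pair_features(m1, m2)
        key = (int(d // dist_bin),
               int(a // angle_bin) % n_angle,   # guard against a == 360.0
               int(r // angle_bin) % n_angle,
               types)
        hist[key] += 1
    return hist
```

A likelihood ratio in the sense of [37] would then compare how probable the observed bins are under competing hypotheses; estimating those probabilities requires the empirical data discussed in this section and is not part of this sketch.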

A crucial point is that these minutiae statistics have to be based on a
large number of real fingers. Minutiae should be marked manually or, in a
semi-automatic fashion, a human should inspect automatically extracted
minutiae in order to avoid minutiae extraction errors and their influence on
the MHs. Ideally, interridge distances or related measures such as ridge
count would be taken into account, and additional information would be
available for minutiae templates, including age, body height, sex and
ethnicity [T8]. This would enable the computation of more sophisticated
statistics, e.g. the probability that a print with a certain minutiae
histogram stems from a person of a certain age group.

In summary, minutiae histograms can become a useful tool for forensic 
experts who are requested to quantify the weight of fingerprint evidence in 
court based on empirical ground truth data. 

6 Conclusion

The proposed extended minutiae histograms capture relevant information
about fingerprints which makes it possible to separate current
state-of-the-art synthetic fingerprints from prints of real fingers. This
study reveals a fundamental difference between natural finger pattern
formation and the synthetic fingerprint generation process. As a consequence,
any results obtained on existing databases of synthetic fingerprints should
be regarded with caution and may not reflect the performance of fingerprint
comparison software in a real-life scenario. A performance evaluation of
FVC2004 showed that 'the behavior of the algorithms over DB4' (the database
consisting of synthetic prints) 'was, in general, comparable to that on the
real databases' [S]. However, this can also be interpreted as an indicator
that the participating algorithms are not yet optimized for the specifics of
the empirical distribution of minutiae in real fingerprints.

6.1 Suggestions on How to Improve the Generation of Synthetic Fingerprints

Recall from Section 1.1 that the SFinGe model silently extends the
long-standing biological hypothesis of fingerprint pattern formation due to
three converging ridge systems (for a brief discussion, cf. [23]) to a
multitude of converging ridge systems starting at random locations, and
treats this process as the governing principle for minutiae formation. Our
work shows that this remarkable hypothesis can be tested, and the tests show
that minutiae pattern formation appears to be a more complex process. It
would be of interest to test fingerprint images generated by [21], [38] and
other researchers using extended minutiae histograms.

One possibility to improve SFinGe is to modify the fingerprint generation 
process in such a way that the resulting fingerprint images have the following 
properties: 

• Average minutiae histograms of synthetic prints should be indistin- 
guishable from average minutiae histograms computed from a database 
of real fingers. 

• Considering minutiae types, the relation of endings to bifurcations 
should resemble the relation and its distribution observed in real fin- 
gers. 

• The global mean and variance of interridge distances should be similar
to those of real fingers acquired at the same resolution.

Two Sides of the Same Coin. An alternative option for the creation of
synthetic fingerprints that pass the proposed 'test of realness' is to consider
synthetic fingerprint generation as a reconstruction task. Traditionally, the
construction of artificial fingerprints is associated with an unknown outcome
regarding the number, location, direction and type of minutiae, whereas
reconstruction aims at generating a fingerprint image which is as similar
as possible, in terms of minutiae properties, to a given template. Basically,
we propose to focus on the generation of a 'realistic' minutiae template and
to leave everything else to the existing reconstruction methods. Here is a
possible outline of this approach:

• First, a feasible foreground and orientation field are constructed, e.g.
using a global model like [46] [TU] which is able to incorporate the
empirical distribution of observed pattern types (Henry-Galton classes)
and singular points [7].

• Secondly, a realistic number of minutiae n is drawn from the empir- 
ical data of minutiae in fingerprints with the previously determined 
foreground size and pattern type. 

• Thirdly, an initial minutiae distribution is obtained by choosing n
points, e.g. randomly, on the foreground as minutiae locations, and
for each location the minutiae direction is set to the local orientation θ
or θ + 180° by chance.

• Fourthly, the 2D-MH of the minutiae template is computed and the template
is modified iteratively until it passes the 'test of realness', i.e. the EMD
between the 2D-MH of the current template and the average 2D-MH of real
fingerprints is below an acceptable threshold. Modification operations are
the deletion and addition of minutiae and flipping of the minutiae direction
by 180°. The implementation of the EMD computation allows the flow of mass
to be analyzed, so that the bins in the 2D-MH which contribute
disproportionately to the total costs can be identified, and thus the
minutiae pairs that are 'the most unlikely' in comparison to the empirical
distribution.

• Fifthly, minutiae types (ending and bifurcation) are assigned, based 
on the empirical distribution in real finger patterns. 

• Finally, the fingerprint image is reconstructed using e.g. Gabor filters 
or the AM-FM model. The interridge distances are checked for devi- 
ations from the empirical data obtained from real fingerprints and if 
required, the interridge distance image is adjusted and the reconstruc- 
tion step is repeated. 

• Optionally, noise can be simulated for copies of the constructed im- 
age, and if desired, they can be rotated, translated and nonlinearly 
distorted. 
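
The third and fourth steps of this outline can be sketched as follows. This is a strongly simplified illustration under several assumptions: the foreground is a plain rectangle, the orientation field is a hypothetical callable returning degrees, the 2D-MH is taken to bin pairwise distance against directional difference over ordered pairs, and the L1 distance between normalized histograms is used as a stand-in for the EMD. Only the direction flip is implemented as a modification operation, and candidates are chosen at random rather than by analyzing the flow of mass.

```python
import numpy as np

def initial_template(n, width, height, orientation, rng):
    """Step 3: n random locations; direction is the local orientation
    theta or theta + 180 degrees, chosen by chance."""
    xs = rng.uniform(0, width, n)
    ys = rng.uniform(0, height, n)
    theta = np.array([orientation(x, y) for x, y in zip(xs, ys)], dtype=float)
    flip = rng.integers(0, 2, n) * 180.0
    return np.column_stack([xs, ys, (theta + flip) % 360.0])

def histogram_2d(template, d_edges, a_edges):
    """Normalized histogram of (distance, direction difference) over all
    ordered minutiae pairs. The choice of these two axes is an assumption."""
    i, j = np.where(~np.eye(len(template), dtype=bool))
    d = np.hypot(template[i, 0] - template[j, 0],
                 template[i, 1] - template[j, 1])
    a = (template[j, 2] - template[i, 2]) % 360.0
    h, _, _ = np.histogram2d(d, a, bins=[d_edges, a_edges])
    total = h.sum()
    return h / total if total > 0 else h

def refine(template, target, d_edges, a_edges, threshold, rng, max_iter=1000):
    """Step 4 (simplified): keep random 180-degree flips that bring the
    histogram closer to the target. L1 distance stands in for the EMD."""
    def dist(t):
        return np.abs(histogram_2d(t, d_edges, a_edges) - target).sum()
    best = dist(template)
    for _ in range(max_iter):
        if best <= threshold:
            break
        cand = template.copy()
        k = rng.integers(len(cand))
        cand[k, 2] = (cand[k, 2] + 180.0) % 360.0
        score = dist(cand)
        if score < best:
            template, best = cand, score
    return template, best
```

In a full implementation, `target` would be the average 2D-MH of real fingerprints, the distance would be the EMD, and deletion and addition of minutiae would be available as further modification operations.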

6.2 Other Research Directions

Another direction of research is the identification of additional features for 
the discrimination between real and synthetic prints. In the context of 
presentation attack detection, it would be highly desirable to detect a
reconstructed fingerprint, even if this property cannot be inferred from the
minutiae distribution. We would also like to explore the suitability of
minutiae histograms for security applications, e.g. in combination with the fuzzy
commitment scheme or the fuzzy vault scheme, and for the generation of 
cancelable templates. 